FIR filter tap architecture for highly dense layout

ABSTRACT

An area-efficient finite impulse response filter having permuted bit-order functional elements that provide substantially straight and direct interconnects with minimized length between adjacent elements. A functional element is coupled with an input data path and an output data path, at least one of which is a permuted bit-order data path exhibiting bit-order ordinal discontinuity. The permuted bit-order data path also can be a transposed permuted bit-order data path in which the placement of at least part of a data path is transposed relative to prior art placements. The bit-order ordinal discontinuity fosters short, straight element interconnects, which lead to increased spatial efficiency and improved performance.

CROSS-REFERENCE TO RELATED APPLICATION(S)

This application claims priority from provisional application No. 60/180,579 filed Feb. 4, 2000, which is incorporated herein in its entirety by this reference.

BACKGROUND OF THE INVENTION

1. Field of the Invention

The present invention generally relates to area-efficient digital elements, particularly digital filters, and more particularly, to apparatus and methods for improving spatial efficiency and related performance in finite impulse response (FIR) filters in a high speed communication system.

2. State of the Art

Applications employing high-performance digital signal processing (DSP) techniques are becoming ubiquitous. Historically, implementations of such high-performance digital signal processors came at the expense of high power consumption and heat dissipation, large die size, and die cost. Many consumer uses, such as personal computers, mass Internet working, portable communication devices, and the like, became feasible as sub-micron VLSI techniques evolved and were incorporated into DSP processing components. As a result, very high performance DSP devices and systems can be offered in much smaller packages with substantial reductions in power requirements, heat dissipation, cost, and so forth. However, the demand for components in systems having ever-higher speed and precision needs, coupled with ever-decreasing cost requirements, is unabated. One of the means by which these needs can be met is to integrate what was once a multi-board “system” into a monolithic VLSI chip. Such integration requires many constituent computational modules to be laid out in a compact, dense, efficient chip architecture. A key factor in developing an efficient layout is minimized interconnect routing between the functional elements and modules within a processor.

Typically, as the level of device integration increases, interconnect length can become one of the most significant factors in determining VLSI system performance. For example, the propagation speed of a signal through an interconnect is dependent upon its length, partly due to the contribution of length to the inherent resistance and capacitance of the interconnect. For global interconnects spanning significant distances of an IC, the effect can be on the order of the square of the interconnect length. On the other hand, local, intra-element interconnects tend to experience propagation delays that are roughly proportional to the interconnect length. Scaling also tends to decrease the power dissipation per gate, which diminishes the ability of a gate to drive the capacitance of an interconnect. Thus, even if a local interconnect is relatively short, inessential length can be highly undesirable.
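The length dependence described above can be sketched with a first-order Elmore delay estimate. This is a textbook approximation offered for illustration only; the function name and the per-unit resistance, capacitance, and driver values are hypothetical and are not taken from the specification.

```python
def elmore_delay(length_um, r_per_um, c_per_um, r_driver=0.0, c_load=0.0):
    """First-order Elmore delay (seconds) of a driver, a distributed RC wire, and a load."""
    r_wire = r_per_um * length_um   # total wire resistance grows with length
    c_wire = c_per_um * length_um   # total wire capacitance grows with length
    # Driver charges all capacitance; the wire resistance sees half of its own
    # distributed capacitance plus the full load capacitance.
    return r_driver * (c_wire + c_load) + r_wire * (c_wire / 2.0 + c_load)

# Illustrative (hypothetical) values, not from any process data sheet.
R_PER_UM = 0.1        # ohms per micrometre
C_PER_UM = 0.2e-15    # farads per micrometre

# Long wire with negligible driver resistance: delay scales as length squared.
global_ratio = elmore_delay(2000, R_PER_UM, C_PER_UM) / elmore_delay(1000, R_PER_UM, C_PER_UM)

# Short wire dominated by a weak driver: delay scales roughly linearly.
local_ratio = (elmore_delay(20, R_PER_UM, C_PER_UM, r_driver=5e3)
               / elmore_delay(10, R_PER_UM, C_PER_UM, r_driver=5e3))

print(global_ratio, local_ratio)
```

Under these assumed values, doubling the long wire roughly quadruples its delay, while doubling the short, driver-dominated wire roughly doubles it.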

Even if the impact of an individual, local interconnect may be deemed relatively modest, the vast numbers of local interconnects distributed throughout a highly-integrated, high-speed VLSI system can, in the aggregate, cause a significant cumulative reduction in system performance. Thus, interconnect length can be the dominating factor in determining circuit performance. Indeed, despite the benefits derived from deep sub-micron device scaling, it can be difficult to take full advantage of the higher switching speeds inherent in scaled devices when the propagation of signals throughout the IC is impaired by relatively long, indirect interconnects.

Existing approaches intended to effect short metal interconnect routes typically result in irregular structures that do not lend themselves to very compact and dense layouts. Conversely, architectural approaches intended to provide a compact, dense layout usually result in relatively long, indirect interconnect lengths, and can require additional metal layers to implement functional element interconnection.

What is needed is a digital element architecture that realizes a very compact and dense layout using functional elements having regular structure and using direct, minimal interconnects between adjacent functional elements.

SUMMARY OF THE INVENTION

The present invention satisfies the above need by providing a digital element that employs a permuted bit-order functional element, which functional element is adapted to perform a preselected function. The functional element is coupled to the input and output data paths of the digital element, and is disposed such that the bit locations of the data paths are arranged in a predetermined bit-order sequence. The functional element can be adapted to provide a permuted bit-order sequence on selected bit locations of the input data path, or the output data path, or both. The permuted bit-order sequence exhibits a predetermined bit-order ordinal discontinuity, which effects substantially straight and direct interconnects, having minimized length, between adjacent structures, thus creating a very compact and dense functional module layout. The functional element also can be adapted to provide a transposed permuted bit-order sequence on selected bit locations of the input data path, or the output data path, or both. The transposed permuted bit-order sequence exhibits a predetermined bit-order ordinal discontinuity in combination with at least a portion of the data path being transposed, relative to customary data path layouts. A functional element having a transposed permuted bit-order effects substantially straight and direct interconnects, having minimized length, between adjacent structures, thus creating a very compact and dense functional module layout.

The constituent components of one embodiment of a functional element according to the present invention can include, for example, multiple primitive logic units such as an AND device, an OR device, an XOR device, a NAND device, a NOR device, a NEXOR device, an inverter, or combinations thereof, disposed such that the resultant functional element is an electronic device having multiple bit locations arranged in other than ordinal numerical order. Another embodiment of a functional element according to the present invention can include combinations of these elementary functional elements, which perform essential arithmetic, logic, and switching functions. Such embodiments of functional elements can include, without limitation, an accumulator, a multiplier, a divider, an adder, a counter, a shifter, a decoder, a controller, a multiplexer, a storage (e.g., memory) element, a logic array, and combinations thereof.

Preferred embodiments of the present invention contemplate a permuted bit-order accumulator, as well as a permuted bit-order accumulator coupled with a multiplier, using an interconnect that is substantially direct and of minimized length. Additional preferred embodiments of the present invention contemplate a transposed permuted bit-order accumulator, as well as a permuted bit-order accumulator coupled with a multiplier, using an interconnect that is substantially direct and of minimized length. A FIR filter tap module of the present invention can include such an accumulator and multiplier, such that an area-efficient FIR filter results therefrom. Indeed, in one preferred embodiment of the invention herein, an integrated multiplier/accumulator (MAC) is provided.

Furthermore, yet other embodiments of the invention herein include functional elements having even greater complexity, so that the functional elements of the invention herein comprehend a complete hierarchy of components, devices, subsystems, and systems including, without limitation, arithmetic logic units; computational modules; processors, including without limitation, general-purpose and digital signal processors; filter tap modules; FIR filters, including a direct-transposed FIR filter; transceivers, including without limitation, gigabit Ethernet transceivers; and communications systems.

BRIEF DESCRIPTION OF THE DRAWINGS

These and other features, aspects and advantages of the present invention will be more fully understood when considered with respect to the following detailed description, appended claims and accompanying drawings, wherein:

FIG. 1 is a data flow diagram of an exemplary linear transversal filter;

FIG. 2A is an illustration of an exemplary module floor plan having prior art elements therein;

FIG. 2B is an illustration of another exemplary module floor plan, having prior art components therein;

FIG. 2C is an illustration of an exemplary module floor plan, having functional elements according to an embodiment of the invention herein;

FIG. 2D is an illustration of an exemplary module floor plan, having functional elements according to another embodiment of the invention herein;

FIG. 3 is a block diagram of a prior art FIR filter tap module floor plan, having prior art components therein;

FIG. 4A is a block diagram of a FIR filter tap module floor plan, having functional elements disposed with a permuted bit order sequence, according to an embodiment of the invention herein;

FIG. 4B is a block diagram of a FIR filter tap module floor plan, having functional elements disposed with a transposed permuted bit order sequence, according to the invention herein;

FIG. 4C is a block diagram of the FIR filter tap module floor plan of FIG. 4A, having functional elements disposed with permuted bit order input and output sequences, according to an embodiment of the invention herein;

FIG. 4D is a block diagram of the FIR filter tap module floor plan of FIG. 4B, having functional elements disposed with transposed permuted bit order input and output sequences according to the invention herein;

FIG. 4E is a block diagram of a digital element having functional elements disposed with a permuted bit order output sequence according to the invention herein;

FIG. 4F is a block diagram of a digital element having functional elements disposed with permuted bit order input and output sequences, according to the invention herein; and

FIG. 5 is a block diagram of a FIR filter according to the present invention, as it is disposed within a gigabit Ethernet transceiver in a communication system.

DETAILED DESCRIPTION OF THE INVENTION

As will be understood by one having skill in the art, most communication systems include digital signal processing (DSP) devices, which are composed of a large number of fundamental components. Such components are repetitively and extensively employed throughout many DSP devices, including, for example, FIR filters. Components which themselves are inefficiently laid out tend to replicate the spatial inefficiency, which ultimately leads to a large VLSI device die size having wasted die area and reduced performance. Furthermore, an inefficient layout having long, angulate interconnect lines can adversely impact the severe power and timing limitations of such high-performance devices. A reduction in the die area consumed by a single component can represent a significant savings in overall die size, due to the replication of the efficiency found within the improved component. Advantageously, such efficiencies in layout typically provide an improvement in device performance, as well.

The present invention is directed to a functional element, the bit locations of which are disposed to provide substantially straight and direct, minimum length interconnects between the functional element and adjacent functional elements, such that the resultant element, device, subsystem, or system exhibits a dense, efficient layout. As used herein, a “functional element” (or “digital element”) can be any element that performs a desired logic function, regardless of the inherent functional complexity. A digital element can include a functional element, and can have input data paths for receiving data, and output data paths for transmitting data after it has been transformed, or delayed, by the functional element. Each functional element can implement a data path using multiple data bit locations on its input and output paths. It is preferred that at least one of the data paths have bit locations arranged in a predetermined bit-order sequence other than an ordinal numerical order. The functional element locations are coupled to the respective digital element input and output data paths. Prior art functional elements are disposed such that the bit-order sequence of the input and output data paths follows a canonical numerical sequence having continuous ordinality, e.g., from most significant bit to least significant bit or vice versa. Functional elements according to the present invention are not so constrained.

In general, functional elements according to the present invention can be synthesized by repetitively and selectively interconnecting multiple primitive logic units, for example, AND gates, OR gates, and inverters, and combinations thereof, to provide the desired preselected function. In turn, primitive functional elements can be selectively combined to create more complex digital elements, which themselves may perform more abstract arithmetic, logic, and switching functions than the functions of their constituent elements. A digital element according to the present invention can include, without limitation, an accumulator, a multiplier, a divider, an adder, a counter, a shifter, a decoder, a controller, a multiplexer, a storage (e.g., memory) element, a logic array, and combinations thereof.

Additionally, both a “functional element” and a “digital element” can comprehend sophisticated, composite components, devices, and systems, including without limitation, arithmetic logic units, general purpose processors, digital filters, including constituent filter taps, and digital signal processors, any of which can be used as a building block in a more complex device, subsystem, or system. Indeed, a digital element according to the present invention can be any electronic device having multiple bit locations. The electronic device can include, without limitation, a sequential circuit, a combinational circuit, or a combination thereof.

Because a complex digital element can include many constituent digital and functional elements and components joined by potentially extensive interconnects, it is desirable that constituent elements provide substantially straight and direct, minimum length interconnects between adjacent functional elements and components, such that the resultant element, device, subsystem, or system exhibits a dense, efficient layout.

In general, each of the input and output data paths of a functional element includes multiple bit locations, which are arranged in a bit-order sequence having ordinal continuity relative to the bit-significance. For example, for a prior art 10-bit data path having such a data bit-order with continuous ordinality, an exemplary sequence would be <0 1 2 3 4 5 6 7 8 9>, with each digit representing a particular data bit which may be of increasing significance. Another exemplary sequence would be <9 8 7 6 5 4 3 2 1 0>, in which data bit 9 is followed seriatim by the lesser significant bits. Although more complex digital elements may have much longer bit sequences, any digital element having a permuted bit-order sequence, or a transposed permuted bit-order sequence, on two or more bit locations, is contemplated by the present invention.
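The notion of ordinal continuity can be illustrated with a short check. This is a minimal sketch; the function name is hypothetical, and it treats a sequence as ordinally continuous when every step between neighboring bit locations is +1, or every step is -1.

```python
def is_ordinally_continuous(bit_order):
    """Return True if the bit-order sequence runs strictly LSB-to-MSB or MSB-to-LSB."""
    if len(bit_order) < 2:
        return True
    # Collect the distinct steps between neighboring bit locations.
    steps = {b - a for a, b in zip(bit_order, bit_order[1:])}
    return steps == {1} or steps == {-1}

ascending = [0, 1, 2, 3, 4, 5, 6, 7, 8, 9]
descending = [9, 8, 7, 6, 5, 4, 3, 2, 1, 0]
permuted = [0, 1, 2, 3, 4, 9, 8, 7, 6, 5]   # hypothetical permuted 10-bit order

print(is_ordinally_continuous(ascending))
print(is_ordinally_continuous(descending))
print(is_ordinally_continuous(permuted))
```

The two canonical sequences quoted above pass the check, whereas the hypothetical permuted order does not.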

A functional element according to the present invention differs from its corresponding canonical bit-ordered form by having the element's bit-order selectively permuted in order to effect substantially straight and direct data paths, or interconnects, having minimized length, between adjacent functional elements. A more complex digital element composed of multiple functional elements, essential digital elements, or both, likewise could benefit from a permuted bit-order sequence or a transposed permuted bit-order sequence, on input data paths, output data paths, or any intermediate data paths.

As used herein, a “permuted bit-order” functional element, or digital element, includes any functional or digital element, as defined above, with a data path having a predetermined bit-order ordinal discontinuity. For example, the data path (input, output, or both) of the functional element can be disposed to provide at least two bit locations or data subpaths, whose bit-order is re-grouped such that the interconnect between the permuted bit-order functional element and an adjacent functional element is substantially straight and direct, with minimized length. As a result, the adjacent structures, together, occupy a smaller portion of die area than would previously be possible using a canonical bit-order having ordinal continuity. A permuted bit-order sequence can be employed on an input data path, an output data path, or an intermediate interconnect that couples adjacent devices.

Similarly, a “transposed permuted bit-order” functional element, or digital element, includes any functional or digital element, as defined above, with a data path having a predetermined bit-order ordinal discontinuity, with at least a portion of the data path being transposed, relative to a customary bit-order sequence. For example, the data path (input, output, or both) of the functional element can be disposed to provide at least two bit locations or data subpaths, whose bit-order is re-grouped such that the interconnect between the permuted bit-order functional element and an adjacent component or functional element is substantially straight and direct, with minimized length. As a result, the adjacent structures, together, occupy a smaller portion of die area than would previously be possible using a canonical bit-order having ordinal continuity. Where feasible to employ, a transposed permuted bit-order functional element can provide additional spatial efficiencies in device layout. A transposed permuted bit-order sequence can be employed on an input data path, an output data path, or an intermediate interconnect that couples adjacent devices.

Hereafter, but solely for the purposes of exposition, it will be useful to describe the present invention in terms of an exemplary linear transversal (FIR) filter. LMS adaptive filters are ubiquitous, and thus well known in the art. Skilled artisans would recognize that the present invention is not limited to adaptive filters, or even to digital communication systems, however.

FIG. 1 illustrates a canonical-form FIR filter which, here, is a nine-tap transversal LMS adaptive equalizer 5. Such a filter generally includes a combination of three basic functional elements: storage devices, multipliers, and adders, although other functional elements may be used. These basic functional elements are combined to create composite functional and digital elements, which themselves are repetitively employed throughout the FIR filter architecture. The functional and digital elements in filter section 10 are coupled by numerous interconnects, each of which is composed of routing wires, the number of which is equivalent to the bit width of the data being communicated between the elements. In a DSP system where time is severely budgeted, the time cost of multiple arithmetic operations, accomplished seriatim, can be significant, making any avoidable delay highly undesirable. Thus, it is advantageous to minimize the length of interconnects wherever possible.

LMS transversal filter section 10 includes multiple filter tap modules 11 a-i, which themselves are composite functional elements composed of the storage elements 27 a-h, 34 a-i, multipliers 30 a-i, 36 a-i, and adders 32 a-i, 38 a-i. Furthermore, accumulators 35 a-i are composite functional elements that can include storage elements 34 a-i and adders 32 a-i. It is apparent from FIG. 1 that equalizer 5 is created by the repetitive selective placement of functional elements having preselected functions. Inefficiencies in the interconnects of the primitive, basic, or composite functional elements in equalizer 5 can rapidly lead to inefficiencies in the layout of, and the die area of, the device, subsystem, or system of which equalizer 5 is a part.
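The repetitive tap structure described above can be sketched in software as a generic nine-tap LMS transversal filter. This is a textbook floating-point LMS update offered only for orientation. It is not a model of the circuit in FIG. 1: the function name, the step size mu, and the training signal are assumptions.

```python
import random

def lms_transversal_filter(samples, desired, n_taps=9, mu=0.05):
    """Generic LMS adaptive transversal filter; returns final coefficients and errors."""
    coeffs = [0.0] * n_taps        # one coefficient per tap (held in a storage element)
    delay_line = [0.0] * n_taps    # tapped delay line of past input samples
    errors = []
    for x, d in zip(samples, desired):
        delay_line = [x] + delay_line[:-1]                   # shift in the new sample
        y = sum(c * s for c, s in zip(coeffs, delay_line))   # multiply and sum per tap
        err = d - y
        # Coefficient accumulation: each tap adds mu * err * sample to its stored value.
        coeffs = [c + mu * err * s for c, s in zip(coeffs, delay_line)]
        errors.append(err)
    return coeffs, errors

# Hypothetical training run: random +/-1 symbols through an ideal (identity) channel.
rng = random.Random(0)
tx = [rng.choice((-1.0, 1.0)) for _ in range(1000)]
coeffs, errors = lms_transversal_filter(tx, tx)
print([round(c, 3) for c in coeffs])
```

Each loop iteration performs, for every tap, one multiply for the output sum and one multiply-accumulate for the coefficient update, which is why the tap layout is replicated so many times in hardware.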

FIGS. 2A and 2B illustrate typical interconnect inefficiencies that are imposed by prior art components 205, 210, 215, 220, of functional module 200 and components 230, 235, 240, 245, of functional module 225, respectively. The respective bit locations of components 205-220 in module 200 and 230-245 in module 225 are arranged with simplicity of the internal device structure in mind, and not with regard to providing short, straight and direct interconnects between adjacent components. In many high-performance devices, particularly where such prior art components are repetitively applied within the device architecture, this simplistic internal structure can create the undesirable effects of inefficient layout, large die size, increased power consumption and capacitance, reduced throughput, and the like.

In the particular example illustrated in FIG. 2A, the input and output data paths of each of components 205, 210, 215, 220 have bit locations which are arranged in bit-order sequence having ordinal continuity. Thus, the canonical ordinally-continuous bit-order sequence for the nineteen-bit output data path 212 of component 210 would be <0 1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18> or vice versa. In the exemplary design depicted by FIG. 2A, it is desired to interconnect the most significant ten bit locations (Bits 18-9) of output data path 212 of component 210 with the bit locations of the ten-bit input data path 214 of component 215, using interconnect 213. By forcing interconnect 213 to be short and straight, the length, L1, of module 200 is minimized. However, the short interconnect 213 imposes the expense of increasing the width, W1, and thus the required die area (L1×W1) of module 200.

In FIG. 2B, it also is desired to interconnect the most significant ten bit locations (Bits 18-9) of the output data path 236 of component 235 with the bit locations of the ten-bit input data path 238 of component 240, using interconnect 237. In this example, module width, W2, is minimized by allowing longer, indirect, and angulate interconnects 237 between adjacent components 235 and 240. Although the width, W2, of module 225 is reduced with this design, relative to width W1 of module 200, the longer, indirect interconnects 237 tend to extend module length, L2, thereby increasing the required die area (L2×W2) of module 225. As in FIG. 2A, the bit-order sequence of each component 230, 235, 240, 245 has ordinal continuity enforced.

By contrast, module 250 in FIG. 2C minimizes both module length, L3, and module width, W3, by employing functional elements having a permuted bit-order sequence on the bit locations of the input data path, the output data path, or both. In the particular example in FIG. 2C, functional element 260 uses an input data path 259 having six input data bit paths 258 (i.e., bits 0-8) and an output data path 261 having nineteen output data bit paths 262. For clarity, the first nine output data bit paths of element 260 are not shown. Similar to the design of FIG. 2B, the design of module 250 in FIG. 2C couples the ten most significant output data bit paths 262 of functional element 260 with the ten input data bit paths 264 of functional element 265. However, in accordance with the present invention, functional element 260 is disposed to provide a permuted bit-order sequence, in this case on the output data path of the element. The illustrated permuted bit-order sequence exhibits ordinal discontinuity such that the predetermined data path bit-order does not observe a strict ordinal succession from most significant data bit to least significant data bit, or vice versa. Instead, the data path bit-order is permuted such that the bit-order sequence provides interconnects 263, which are substantially straight, direct, and of minimized length.

In this particular example, the output data path 262 of functional element 260 has an ordinal discontinuity spanning two data subpaths, with the first data subpath (not shown) including the nine least significant bits (i.e., Bits 0-8), and the second data subpath including the ten most significant bits (i.e., Bits 9-18). The bit-order permutation of this example is created by disposing the bit locations in functional element 260 such that the order of the bits on the second data subpath is “reversed” relative to the usual ordinal sequence of the element's bits. Thus the data bit paths of functional element 260 follow the sequence <0 1 2 3 4 5 6 7 8 18 17 16 15 14 13 12 11 10 9>, with the ordinal discontinuity occurring after Bit 8. The permuted bit-order second data subpath 262 then is coupled to input data path 264 of adjacent functional element 265 via short, direct interconnects 263. As a result, the required die area (L3×W3) of module 250 is reduced.
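The permuted sequence of this example can be reproduced with a few lines of code. This is a minimal sketch with hypothetical function names. It builds a bit order whose upper subpath is reversed, and reports the bit after which an ordinal discontinuity occurs (taken here to be any step between neighboring locations other than +1 or -1).

```python
def permute_upper_subpath(width, split):
    """Bit order with bits [0, split) ascending and bits [split, width) reversed."""
    lower = list(range(split))                       # first data subpath, e.g. Bits 0-8
    upper = list(range(width - 1, split - 1, -1))    # second data subpath, reversed
    return lower + upper

def ordinal_discontinuities(bit_order):
    """Bits after which the step to the next bit location is neither +1 nor -1."""
    return [a for a, b in zip(bit_order, bit_order[1:]) if abs(b - a) != 1]

sequence = permute_upper_subpath(width=19, split=9)
print(sequence)
print(ordinal_discontinuities(sequence))
```

For a nineteen-bit path split after Bit 8, the sketch yields the sequence quoted above, with a single discontinuity reported after Bit 8.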

In FIG. 2D, another preferred embodiment of a permuted bit-order functional element 285 is illustrated within module 275. Although permuted bit-order functional elements were described above in terms of a permuted bit order relative to an input, an output, or both, element 285 is a transposed permuted bit-order functional element in which input portion 284 of the module is transposed relative to the output portion 286, such that input data path 283 enters element 285 in a defined spatial relationship with output data path 287, represented in FIG. 2D as a vertical spatial relationship. An advantage of the transposed permuted bit order of functional element 285 is that element 280 may be placed in a vertical relationship with functional element 290, thereby increasing the spatial efficiency of module 275 by further reducing the length, L4, of module 275. In yet another embodiment of a transposed permuted bit order functional element 285, it is possible to integrate element 280 directly into the input portion 284 of functional element 285, thus affording even greater spatial efficiencies, where elements 280 and 285 are amenable to such integration. As a result, the required die area (L4×W4) of module 275 is reduced.

A skilled artisan now would recognize that a permuted bit-order functional, or digital, element can have a permuted bit-order sequence on the output data path, the input data path, or both. Also, the data path of a permuted bit-order functional, or digital, element can have two or more data subpaths with two or more ordinal discontinuities in the data bit-order sequence, as may be useful to effect additional short, direct interconnects with other elements. Furthermore, where the bit locations of both input and output data paths are selectively permuted, a different bit sequence permutation can be employed on each of the input data path and the output data path, depending upon the efficiency goals intended by the permutation. Moreover, by providing a transposed permuted bit order sequence, and by placing the input data path in a defined transposed spatial relationship with the output data path, a functional element yielding additional efficiencies, including substantial spatial efficiencies, can be devised.

FIG. 3 illustrates an exemplary prior art digital filter tap module 300 that includes first multiplier 310, ripple carry accumulator 320, second multiplier 330, and adder 340. One design goal of prior art module 300 is to interconnect constituent components 310, 320, 330, 340 to effect minimal routing therebetween, as implemented by interconnects using relatively short, straight metal routing lines 312, 322, 332. The relative efficiency gained by the short, straight metal interconnects of present design technologies comes at the expense of a relatively expansive, irregular device layout. This is similar to the disadvantages illustrated in the example of FIG. 2A. Other present design techniques provide regular structures but at the expense of relatively long interconnect lengths, and/or the need for additional metal layers, similar to the disadvantages described with regard to FIG. 2B.

Heretofore, whether in the design of short metal routes, or in the design of a compact structure, constituent components 310, 320, 330, 340 were provided as devices having bit locations arranged in sequential bit order exhibiting ordinal continuity, i.e., using a canonical numerical order. Canonically-ordered components can hinder the design of module architectures in which it is most desirable to create a compact, regular structure having constituent components coupled by substantially straight interconnects of minimized length. Preferred embodiments of the present invention can effect such desirable module architectures.

FIG. 4A illustrates one embodiment of the present invention, this time in the form of FIR filter tap module 400. Tap module 400 includes, solely for the purpose of this example, first multiplier 410, accumulator 420, second multiplier 430, and adder 440. In FIG. 4A, accumulator 420 is arranged such that both design goals, namely a compact, regular structure and substantially straight and direct interconnects with minimized length, are effected. A skilled artisan would realize that the invention herein is not limited to accumulators, such as accumulator 420, or even to modules such as FIR filter tap module 400, but can be employed in numerous basic and composite functional elements, alone, or in combination with other functional elements, including, without limitation, adders, multipliers, and storage elements. More complex digital elements are within the scope of the present invention, including, for example, an arithmetic unit, a general purpose processor, a DSP processor, a gigabit Ethernet system, a communication system, and the like.

In the particular preferred embodiment of the invention shown in FIG. 4A, accumulator 420 can be a ripple carry accumulator of 19-bit length, with permuted bit-order output data subpaths, such that first output data subpath 422 implements nine lower order bit locations (i.e., Bits 0-8), and second output data subpath 424 implements ten upper order bit locations (i.e., Bits 9-18). As a result of employing permuted bit-order accumulator 420, interconnect 427 between accumulator 420 and adjacent multiplier 430 can be accomplished by substantially direct, short metal lines. Exemplary accumulator 420 is arranged with input and output on opposing sides such that data signals generally propagate unidirectionally, e.g., from left to right. Unlike accumulator 320 in FIG. 3, in which enforcement of bit-order ordinal continuity leads to an irregular, and relatively larger, configuration for module 300, permuted bit-order accumulator 420 can have its data subpaths 422, 424 arranged to effect substantially straight and direct interconnects 427 having minimized length. Using the permuted bit-order functional elements of the present invention, the layout of constituent components 410, 420, 430, 440 of filter tap module 400 lends itself to a compact, regular structure that minimizes consumed die area, element interconnect lengths, power consumption, and device latency. Here, the output data path is arranged to be opposite the input data path, relative to the longitudinal axis of element 420. Bit-order discontinuity in exemplary accumulator 420 is provided on the right side, or output data path, although a discontinuity also could be provided on the input data path, or on both.
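At the value level, the division of a 19-bit accumulator output into a nine-bit lower subpath and a ten-bit upper subpath can be sketched as follows. The function and constant names are hypothetical, and the wrap-around on overflow is an assumption made for the sketch, since the overflow behavior of accumulator 420 is not stated here.

```python
ACC_WIDTH = 19    # accumulator length in bits
SPLIT = 9         # Bits 0-8 form the lower subpath, Bits 9-18 the upper subpath

def accumulate(acc, addend):
    """Add to the accumulator, wrapping at 19 bits (assumed overflow behavior)."""
    return (acc + addend) & ((1 << ACC_WIDTH) - 1)

def split_subpaths(acc):
    """Return (lower, upper): the nine low-order bits and the ten high-order bits."""
    lower = acc & ((1 << SPLIT) - 1)
    upper = acc >> SPLIT
    return lower, upper

acc = 0
for addend in (300, 300, 300):
    acc = accumulate(acc, addend)

lower, upper = split_subpaths(acc)
print(acc, lower, upper)   # 900 = 1 * 512 + 388
```

Only the ten-bit upper value would travel over the short interconnect to the adjacent multiplier in such an arrangement. The physical ordering of those ten bit locations is a separate, layout-level choice.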

In FIG. 4B, another embodiment of a FIR filter tap module 450 is illustrated, which includes, solely for the purpose of this example, first multiplier 455, accumulator 460, second multiplier 470, and adder 480. In FIG. 4B, accumulator 460 is arranged to have a transposed permuted bit order, similar to functional element 285 in FIG. 2D. Data is received from component 455 by functional element input section 456 via input data path 454, and is transmitted to component 470 from functional element output section 458 via output data path 467. In contrast to the accumulator 420 in FIG. 4A, element 460 in FIG. 4B is arranged such that input data path 454 into input section 456 is disposed on the same side as output data path 464 of output section 458. Input data path 454 of element 460 in FIG. 4B thus is transposed relative to a customary input layout, such as that of the input data path of element 420 in FIG. 4A, for the purpose of providing additional spatial efficiencies in the layout of filter tap module 450. Thus, element 460 is said to have a “transposed permuted bit order sequence.”

By employing a transposed permuted bit order design in accumulator 460, multiplier 455 can be placed in a defined spatial relationship with second multiplier 470, and make use of otherwise underutilized device area. The transposed permuted bit order of accumulator 460 can lead to a substantial increase in the spatial efficiency of tap module 450 over prior art modules. Furthermore, due to the spatial efficiencies afforded by the transposed permuted bit order of accumulator 460, it is possible to integrate a multiplier with accumulator 460, to provide a transposed permuted bit-order multiplier-accumulator (MAC) 490, which can yield even greater spatial efficiencies in the layout of tap module 450.

In addition to being advantageously implemented in a tap module, a functional element, as exemplified by accumulator 460, having a permuted bit-order sequence, or a transposed bit order sequence, also can be advantageously implemented in a computational module, and in a processor, including a general purpose processor and a digital signal processor. Indeed, a functional element according to the present invention can be devised and adapted to perform a preselected function in virtually any digital environment.

Referring now to FIG. 4C, there is illustrated a block diagram of the FIR filter tap module 400 floor plan of FIG. 4A, having functional elements disposed with permuted bit order input and output sequences, according to an embodiment of the invention herein. More specifically, the accumulator 420 has a bit-order discontinuity on the output data path (e.g., data subpath 424), as well as a bit-order discontinuity on the input data path 418. The input data path 418 implements six bit locations in the accumulator 420, which receive data from the multiplier 410. The six bit locations of the input data path 418 contain six bits in a permuted bit order.
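
The benefit of matching a permuted bit order across two facing elements can be illustrated with a toy wire-length model. The sketch below is an assumption-laden illustration, not the layout of FIG. 4C: the specification does not give the actual six-bit permutation, so `MULTIPLIER_ORDER` is purely hypothetical, as are the function name `wire_length` and the idea of measuring length in units of bit-cell slots.

```python
def wire_length(src_order, dst_order):
    """Total slot offset between matching bits on two facing edges.

    Each list gives the bit index found at physical slot 0, 1, 2, ...
    A total of zero means every wire runs straight across.
    """
    src_slot = {bit: slot for slot, bit in enumerate(src_order)}
    dst_slot = {bit: slot for slot, bit in enumerate(dst_order)}
    return sum(abs(src_slot[b] - dst_slot[b]) for b in src_slot)


# Hypothetical physical order in which a multiplier presents six bits.
MULTIPLIER_ORDER = [3, 4, 5, 0, 1, 2]

# Accumulator input laid out with bit-order ordinal continuity.
CONTINUOUS_ORDER = [0, 1, 2, 3, 4, 5]

# Accumulator input laid out with the same permutation as the source.
PERMUTED_ORDER = list(MULTIPLIER_ORDER)

mismatched = wire_length(MULTIPLIER_ORDER, CONTINUOUS_ORDER)
matched = wire_length(MULTIPLIER_ORDER, PERMUTED_ORDER)
```

Under this toy model the continuous layout forces every wire to jog three slots sideways, while the permuted layout lets each wire run straight across, which is the qualitative effect the text attributes to permuted bit-order data paths.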

Referring now to FIG. 4D, there is illustrated a block diagram of the FIR filter tap module 450 floor plan of FIG. 4B, having functional elements disposed with transposed permuted bit order input and output sequences, according to the invention herein. More specifically, the accumulator 460 has a bit-order discontinuity on the output data path 467 (e.g., data subpath 464), as well as a bit-order discontinuity on the input data path 454. The input data path 454 implements six bit locations, which receive data from the multiplier 455. The six bit locations of the input data path 454 contain bits in a permuted bit order.

FIG. 4E is a block diagram of a digital element 480 having functional elements disposed with a permuted bit order output sequence, according to the invention herein. More specifically, the digital element 480 comprises one or more functional elements, for example, functional element 1 (481), functional element 2 (482) and functional element n (483). The input data path 484 may be transposed on the same side as the output data path 486, i.e., input data path 485. The input data bits 487 (and input data bits 489 if the input data path is transposed) may be of continuous ordinality, i.e., the bit order is not permuted. However, the output data bits may comprise a permuted bit order sequence 488. As previously described, a digital element 480 may comprise one or more functional elements, where each functional element may comprise an AND device, an OR device, an XOR device, a NAND device, a NOR device, a NEXOR device, an inverter, an accumulator, a multiplier, a divider, an adder, a counter, a shifter, a decoder, a controller, a multiplexer, a storage element, a logic array, and combinations thereof.

FIG. 4F is a block diagram of a digital element 490 having functional elements disposed with permuted bit order input and output sequences, according to the invention herein. More specifically, the digital element 490 comprises one or more functional elements, for example, functional element 1 (491), functional element 2 (492) and functional element n (493). The input data path 494 may be transposed on the same side as the output data path 496, i.e., input data path 495. The input data bits 497 (and input data bits 499 if the input data path is transposed) may be of permuted bit order. In addition, the output data bits may also comprise a permuted bit order sequence 498. As previously described, a digital element 490 may comprise one or more functional elements, where each functional element may comprise an AND device, an OR device, an XOR device, a NAND device, a NOR device, a NEXOR device, an inverter, an accumulator, a multiplier, a divider, an adder, a counter, a shifter, a decoder, a controller, a multiplexer, a storage element, a logic array, and combinations thereof.

In FIG. 5, another preferred embodiment of the present invention in the form of FIR filter 500 is illustrated, in which the delay elements 510 a-h are selectively distributed between the filter input data path 502, and the output data path 504. Structural and operational details of FIR filter 500 are further discussed in U.S. Pat. No. 6,272,173 issued on Aug. 7, 2001 entitled “EFFICIENT FIR FILTER FOR HIGH-SPEED COMMUNICATION”, having the same inventor and being assigned to the same assignee hereof, and which hereby is incorporated by reference in its entirety herein. FIR filter 500 can be a direct-transposed FIR filter, which can be a constituent element of a gigabit Ethernet transceiver 550 in communication system 580.

In FIG. 5, filter 500 is an 8th-order LMS adaptive FIR filter having nine tap elements 520 a-i, with eight delay structures 510 a-h therebetween. The overall architecture of the FIR filter in FIG. 5 differs from the canonical FIR filter architecture in FIG. 1, in that delay elements 510 a-h are not located exclusively on the input path 43 as in the direct-form filter 10, but are placed on both input path 502 and output path 504. As explained in U.S. patent application Ser. No. 09/437,722, this distribution of delay elements helps the filter designer balance device element size and power consumption against filter operating speed.
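
The effect of moving delay elements between the input path and the output path can be sketched by comparing the two extreme placements. The Python sketch below is illustrative only: `fir_direct` keeps all delays on the input path (the canonical direct form), `fir_transposed` keeps all delays on the output path, and both names, the coefficient values, and the sample values are hypothetical. It does not reproduce the mixed placement of FIG. 5 or the LMS coefficient adaptation; it only shows that relocating the delays leaves the filter output unchanged.

```python
def fir_direct(coeffs, samples):
    """Direct form: all delays sit on the input path."""
    delay = [0] * (len(coeffs) - 1)          # 8 delays for 9 taps
    out = []
    for x in samples:
        taps = [x] + delay                   # current and delayed inputs
        out.append(sum(c * t for c, t in zip(coeffs, taps)))
        delay = taps[:-1]                    # shift the input delay line
    return out


def fir_transposed(coeffs, samples):
    """Transposed form: all delays sit on the output (accumulation) path."""
    n = len(coeffs)
    state = [0] * (n - 1)                    # delayed partial sums
    out = []
    for x in samples:
        products = [c * x for c in coeffs]   # every tap sees the same input
        out.append(products[0] + (state[0] if state else 0))
        # Each register takes its tap product plus the next register's value.
        state = [
            products[i + 1] + (state[i + 1] if i + 1 < n - 1 else 0)
            for i in range(n - 1)
        ]
    return out


COEFFS = [1, 2, 3, 4, 5, 6, 7, 8, 9]         # nine illustrative tap weights
SAMPLES = [3, -1, 4, 1, -5, 9, 2, -6, 5, 3, 5, -8]

direct_out = fir_direct(COEFFS, SAMPLES)
transposed_out = fir_transposed(COEFFS, SAMPLES)
```

Because both placements compute the same output, a designer is free to choose the distribution of delays on other grounds, such as the size, power, and speed trade-off described above.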

By employing permuted bit-order functional elements in the design of filter 500, for example, accumulator 420 and tap module 400 in FIG. 4, additional efficiencies in the die area, operating characteristics, and performance of filter 500 can be realized. Furthermore, in view of the modular and hierarchical nature of permuted bit-order functional elements, the aforementioned efficiencies afforded by the present invention can be advantageously used in the design of many types of devices, subsystems, and systems. Thus, as illustrated in FIG. 5, filter 500 devised according to the present invention can be a constituent component of gigabit Ethernet transceiver 550. Another embodiment of the present invention contemplates transceiver 550 coupled to a partner transceiver 560 communicating over channel 570, together constituting communication system 580.

Many alterations and modifications may be made by those having ordinary skill in the art without departing from the spirit and scope of the invention. Therefore, it must be understood that the illustrated embodiments have been set forth only for the purposes of example, and that they should not be taken as limiting the invention as defined by the following claims. The following claims are, therefore, to be read to include not only the combination of elements which are literally set forth but all equivalent elements for performing substantially the same function in substantially the same way to obtain substantially the same result. The claims are thus to be understood to include what is specifically illustrated and described above, what is conceptually equivalent, and also what incorporates the essential idea of the invention.

What is claimed is:
 1. A digital element, comprising: a. an input data path having a plurality of input bit locations arranged in a predetermined input bit-order sequence; b. an output data path having a plurality of output bit locations arranged in a predetermined output bit-order sequence; c. a first functional element adapted to perform a first preselected function, and coupled to the input data path; d. a second functional element adapted to perform a second preselected function, and coupled to the output data path; and e. an intermediate interconnect coupling the first functional element to the second functional element, the first functional element being adapted to provide output bit locations having a permuted bit-order sequence, such that the intermediate interconnect between the first functional element and the second functional element is rendered substantially direct and of minimized length, wherein one of the first and second functional elements comprises one of an accumulator, a multiplier, a divider, an adder, a counter, a shifter, a decoder, a controller, a multiplexer, a logic array, a storage element, and a combination thereof, and wherein another one of the first and second functional elements comprises a digital filter tap.
 2. The digital element of claim 1, wherein the one of the first and second functional elements comprises a finite impulse response (FIR) filter.
 3. The digital element of claim 2, wherein the FIR filter is operably disposed in a gigabit Ethernet transceiver.
 4. The digital element of claim 2, wherein the finite impulse response filter comprises a direct-transposed FIR filter.
 5. The digital element of claim 4, wherein the FIR filter is operably disposed in a gigabit Ethernet transceiver.
 6. A digital element, comprising: a. an input data path having a plurality of input bit locations arranged in a predetermined input bit-order sequence; b. an output data path having a plurality of output bit locations arranged in a predetermined output bit-order sequence; c. a first functional element adapted to perform a first preselected function, and coupled to the input data path; d. a second functional element adapted to perform a second preselected function, and coupled to the output data path; and e. an intermediate interconnect coupling the first functional element to the second functional element, the first functional element being adapted to provide output bit locations having a permuted bit-order sequence, such that the intermediate interconnect between the first functional element and the second functional element is rendered substantially direct and of minimized length, wherein the first and second functional elements comprise an adaptive filter.
 7. A digital element, comprising: a. an input data path having a plurality of input bit locations arranged in a predetermined input bit-order sequence; b. an output data path having a plurality of output bit locations arranged in a predetermined output bit-order sequence; c. a first functional element adapted to perform a first preselected function, and coupled to the input data path; d. a second functional element adapted to perform a second preselected function, and coupled to the output data path; and e. an intermediate interconnect coupling the first functional element to the second functional element, the first functional element being adapted to provide output bit locations having a permuted bit-order sequence, such that the intermediate interconnect between the first functional element and the second functional element is rendered substantially direct and of minimized length, wherein the first and second functional elements are operably disposed in a gigabit Ethernet transceiver.
 8. A digital filter tap module, comprising: a. an input data path; b. an output data path; c. a first functional element being adapted to perform a first preselected function, having bit locations, and being coupled to the input data path; d. a second functional element adapted to perform a second preselected function, and coupled to the output data path; and e. an intermediate interconnect coupling the first functional element to the second functional element, the first functional element being adapted to provide a permuted bit-order sequence on selected bit locations coupled to the intermediate interconnect, such that the intermediate interconnect between the first functional element and the second functional element is rendered substantially direct and of minimized length, wherein the first and second functional elements comprise a finite impulse response (FIR) filter.
 9. The digital filter tap module of claim 8, wherein the finite impulse response filter comprises a direct-transposed FIR filter.
 10. The digital filter tap module of claim 9, wherein the FIR filter is operably disposed in a gigabit Ethernet transceiver.
 11. The digital filter tap module of claim 8, wherein the FIR filter is operably disposed in a gigabit Ethernet transceiver.
 12. An area-efficient finite impulse response (FIR) filter, comprising: a. an accumulator having a permuted bit-order output; and b. a multiplier, coupled to receive the permuted bit-order output of the accumulator; wherein permuted bit-order output of the accumulator is coupled to the multiplier such that an interconnect between the accumulator and the multiplier is rendered substantially direct, and minimizing an area of the FIR filter thereby.
 13. The area-efficient FIR filter of claim 12, wherein the filter is a direct-transposed FIR filter.
 14. The area-efficient FIR filter of claim 13, wherein the FIR filter is disposed in a gigabit Ethernet system.
 15. The area-efficient FIR filter of claim 13, wherein the FIR filter is disposed in a communication system.
 16. A digital element, comprising: a. an input data path having a plurality of input bit locations arranged in a predetermined input bit-order sequence; b. an output data path having a plurality of output bit locations arranged in a predetermined output bit-order sequence; c. a first functional element adapted to perform a first preselected function, and coupled to the input data path; d. a second functional element adapted to perform a second preselected function, and coupled to the output data path; and e. an intermediate interconnect coupling the first functional element to the second functional element, the first functional element being adapted to provide output bit locations having a permuted bit-order sequence, such that the intermediate interconnect between the first functional element and the second functional element is rendered substantially direct and of minimized length, and wherein the first and second functional elements are operably disposed in a gigabit Ethernet transceiver.
 17. The digital element of claim 16, wherein one of the first and second functional elements comprises a plurality of logic units, ones of the plurality of logic units including one of an AND device, an OR device, an XOR device, a NAND device, a NOR device, a NEXOR device, an inverter, and a combination thereof.
 18. The digital element of claim 16, wherein one of the first and second functional elements comprises one of an accumulator, a multiplier, a divider, an adder, a counter, a shifter, a decoder, a controller, a multiplexer, a logic array, a storage element, and a combination thereof.
 19. The digital element of claim 18, wherein another one of the first and second functional elements comprises a digital filter tap.
 20. The digital element of claim 19, wherein another one of the first and second functional elements comprises a finite impulse response (FIR) filter.
 21. The digital element of claim 20, wherein the FIR filter is operably disposed in the gigabit Ethernet transceiver.
 22. The digital element of claim 20, wherein the finite impulse response filter comprises a direct-transposed FIR filter.
 23. The digital element of claim 22, wherein the FIR filter is operably disposed in the gigabit Ethernet transceiver.
 24. The digital element of claim 16, wherein the one of the first and second functional elements comprises an arithmetic logic unit.
 25. The digital element of claim 16, wherein one of the first and second functional elements comprises one of a general purpose processor and a digital signal processor.
 26. The digital element of claim 16, wherein the first and second functional elements comprise an adaptive filter.
 27. A method of generating a layout for a digital module, comprising: selectively permuting bit locations of a functional element to provide a predetermined bit-order discontinuity thereof, and creating permuted bit locations thereby, wherein the functional element is operably disposed in the module, and the selectively permuting results in the module having improved spatial efficiency; transposing selected bit locations of the functional element creating transposed bit locations; and coupling an input data path and an output data path to the functional element, wherein the functional element is an accumulator.
 28. The method of claim 27, further comprising: coupling a first component to the functional element bit locations having the predetermined bit-order discontinuity.
 29. The method of claim 28, wherein the first component is a multiplier.
 30. The method of claim 27, further comprising: coupling a first component to the output data path, the output data path being coupled to functional element bit locations having a predetermined bit-order discontinuity.
 31. The method of claim 27, further comprising: coupling a first component to the input data path, the input data path being coupled to the transposed bit locations.
 32. The method of claim 31, further comprising: coupling a second component with the output data path, the output data path being coupled with the permuted bit locations.
 33. The method of claim 32, wherein the functional element is an accumulator, the first component is a first multiplier, and the second component is a second multiplier.
 34. The method of claim 27, wherein the digital module is one of an arithmetic logic unit, a computational module, a filter tap module, a transceiver, a communication system, and a combination thereof.
 35. A method of generating a layout for a digital module, comprising: selectively permuting bit locations of an accumulator to provide a predetermined bit-order discontinuity thereof, and creating permuted bit locations thereby, wherein the accumulator is operably disposed in the module, and the selectively permuting results in the module having improved spatial efficiency; transposing selected bit locations of the accumulator creating transposed bit locations; coupling an input data path and an output data path to the accumulator; and coupling a first component to the input data path, the input data path being coupled to the transposed bit locations.
 36. The method of claim 35, wherein the first component is a multiplier.
 37. A method of generating a layout for a digital module, comprising: selectively permuting the bit locations of an accumulator to provide a predetermined bit-order discontinuity thereof, and creating permuted bit locations thereby, wherein the accumulator is operably disposed in the module, and the selectively permuting results in the module having improved spatial efficiency; transposing selected bit locations of the accumulator creating transposed bit locations; coupling an input data path and an output data path to the accumulator; coupling a first multiplier to the input data path, the input data path being coupled to the transposed bit locations; and coupling a second multiplier with the output data path, the output data path being coupled with the permuted bit locations.